ABSTRACT


A general, but approximate, method is suggested to determine
confidence limits for a product of independent positive random
variables. The method is developed and the accuracy of the
approximation is discussed in one of its applications: the Bayesian
confidence limits for the product of N binomial parameters. Four
applications (including the above), in which the factors in the
product are probabilities, are given together with numerical examples.

These are from the antisubmarine warfare (ASW) field and contribute
to the evaluation of operational data in that field and statistical
data in many other fields.


PREFACE 


This  work  is  done  as  part  of  the  exercise  research  programme  at 
SACLANTCEN  and  supports  the  evaluation  of  measures  of  operational 
ASW  performance  collected  in  the  "SACLANTCEN  Compendium  of  NATO  ASW 
Exercises".


INTRODUCTION


In SACLANTCEN's analysis of the data obtained from NATO's ASW
exercises, as in similar statistical analyses for other purposes,
probabilities are frequently used as measures of performance. For
example, the performance of shipborne sonars in detecting submarines
can be measured statistically by recording the frequencies of
detection at different ranges and using those data to estimate
the probabilities of detection at selected ranges of interest.
However, like all statistically-derived values, these estimates
have little value to the user, and may even mislead him, unless
they can be provided with some indication of the statistical
uncertainties resulting from the inadequacies of the basic data.

The  purpose  of  this  paper  and  of  a preceding  paper  [Ref.  1]  by  the 
same  author  is  to  present  a method  of  giving  approximate  confidence 
limits  to  such  estimates  of  probability. 

The original problem [Ref. 1] was to find confidence limits for the
probability that a sonar-fitted surface ship would detect a submarine
by its sonar outside a given range from the ship to the submarine.
To that end a sample of relative tracks between the ship
and submarine was collected, including notes about detections and
opportunities for detection as functions of range between ship
and submarine. The range was divided into brackets; calling c_i
the probability of detection in range bracket i, the probability
of detection outside a given range could be written as

    1 - \prod_i (1 - c_i) ,

where the product was taken over all brackets outside the given
range.

We would like to find confidence limits for that expression, given
some distribution for c_i.

It  was  noted  that  this  problem  was  a special  case  of  a more  general 
problem:  to  find  confidence  limits  for  a product  of  independent 

positive  random  variables.  If  one  could  solve  this  problem  one 
would  not  only  have  solved  the  present  problem  but  many  other 
related  problems  involving  a product  of  random  variables. 

Considerable  work  has  been  done  in  this  area.  The  only  exact 
method  known  to  the  author  is  that  by  Springer  and  Thompson 
[Ref.  2],  which  assumes  a particular  distribution  for  each 
factor. Other methods [Refs. 1, 3, 4] are both approximate
and  assume  particular  distributions  for  the  factors.  The  purpose 
of  this  paper  is  to  present  a method  that  does  not  assume 
particular  distributions  for  the  factors.  The  method  however 
is  only  approximate. 
All data collectors, who are trying to estimate products of random
(stochastic) variables on the basis of sampling experiments on the
individual factors, have an interest in confidence limits. As
examples, one can mention:

a) Estimation of sonar, or radar, detection probabilities
as a function of range or time.

b) The use of a product of probabilities of detection,
correct classification, localization, and
successful attack, as a measure of effectiveness
for a tactical process.

c) Estimation of failure rates (from small samples)
of series, parallel and series-parallel systems
(electronic apparatus, missiles, aircraft, etc.)
in reliability and quality control problems.

The  author  believes  that  it  is  more  important  to  present  confidence 
limits  for  a measured  quantity  than  to  give  an  estimate  of  it. 

At  once  it  must  be  said  that  it  is  generally  more  difficult  to 
determine  the  confidence  limits  than  to  determine  an  estimate, 
because  confidence  limits  require  some  knowledge  of  the  distribu- 
tion of  the  quantity  whereas  an  estimate  often  does  not.  Confidence 
limits  contain  more  information  than  an  estimate;  having  confidence 
limits  one  can  always  find  an  estimate  but  not  vice  versa.  Confidence 
limits  are  important  when  comparing  two  measures,  such  as  results 
from  two  experiments  (in  the  form  of  figures  or  curves),  as  they 
can  give  a first  indication  of  whether  the  two  quantities  could 
stem  from  the  same  population  or  not  (however  a proper  test  has  to 
be  performed  in  the  end).  Estimates  for  the  two  quantities  give 
no  information  in  that  respect. 

This  paper  generalizes  the  determination  of  confidence  limits  for  a 
product  of  probabilities,  so  as  to  cover  problems  involving  a 
product  of  independent  positive  random  variables  in  general. 

Because  of  this  generalization,  the  problem  posed  by  the  present 
author  in  Ref.  1 is  solved  with  fewer  approximations. 

The  opportunity  is  also  taken  here  to  correct  an  error  in  Sect.  1.4 
of  Ref.  1,  which  should  read: 

Equation  1 to  be  used  as  an  exact  expression  in  both 
the  discrete  and  the  continuous  case. 

Equation  3 to  be  used  only  as  an  approximation  for 
the  easy  calculation  of  the  confidence  limits  in  Ch.  2. 

However  this  latter  approximation  is  no  longer  needed  due  to  the 
improved  method  presented  in  the  present  memorandum,  which  in 
fact,  should  now  be  regarded  as  a replacement  for  Ref.  1. 


1. GENERAL METHOD


1.1  Definition  of  the  Problem 

We want confidence limits for the random variable C, which is
given by

    C = \prod_{i=1}^{N} C_i ,                                   [Eq. 1]

where {C_i} is a set of independent positive random variables with
density functions f_i(c_i).

We denote a random variable by an upper-case letter (e.g. C) and
use the corresponding lower-case letter (e.g. c) for a particular
value that the random variable assumes. The problem is to find
a confidence limit for C, c_p, such that

    \mathrm{Prob}\{C < c_p\} = p ,                              [Eq. 2]

p is usually called the confidence level (or coefficient) and has
to be decided in advance.

[Fig. 1: Definition of confidence limit. g(c) is the density
function of C, and c_p means that c is dependent on the chosen p.]


1.2 Outline of the Method

The  method  presented  in  this  paper  can  be  divided  into  four  stages: 

a.  Transform  the  problem  of  finding  confidence  limits 
for  a product  of  independent  positive  random 
variables  to  the  problem  of  finding  confidence 
limits  for  a sum  of  new  independent  (not 
necessarily  positive)  random  variables  by 
means  of  a logarithmic  transformation. 
b. Find the characteristic function (related to the moment
generating function) of a sum of independent random
variables. When this is done, the distribution of
the sum is in principle determined.

c. Determine the cumulants (related to the moments) of
this distribution by means of the characteristic
function.

d. Apply a method of finding the inverse of a distribution
function by means of its cumulants.


1.3 Transformation

Already at this stage something can be said about g(c): under
fairly general conditions [Ref. 5], g(c) will approach the
lognormal distribution as N → ∞, independent of the form of the
density functions for the factors f_i(c_i). This is a direct
consequence of the central limit theorem.

Figure  2 shows  the  shape  of  the  lognormal  distribution. 


In most practical cases it is impossible or very difficult to
find g(c) by analytical means (for example, transformation by
the characteristic function or by the Jacobian determinant) for
most of the distributions (binomial, Poisson, normal, etc.)
assumed for f_i(c_i) (the rectangular and beta distributions
are exceptions in some cases, see Ref. 3).

The approach in this paper uses a general method to determine an
arbitrary distribution developed by Cornish and Fisher in 1937
[quoted in Refs. 6, 7 and 9], which takes as its starting
point the normal distribution and finds a transformation between
this well-known distribution and the actual distribution. This
transformation is a function of the higher moments of the actual
distribution. The Cornish-Fisher approximation [copied from
Ref. 9] is given in Appendix A.
The important point at this stage is that the method is based on
the normal distribution. Therefore we will transform the stochastic
variable C into a new stochastic variable Y so that instead
of g(c) we will have a new density function h(y), which will
approach the normal distribution as N → ∞. This transformation
is simply Y = ln C.

Adding the additional assumption C_i > 0 we are able to define a
new set of independent random variables as

    Y_i = \ln C_i                                               [Eq. 3]

and

    Y = \sum_{i=1}^{N} Y_i .                                    [Eq. 4]

Then Eq. 2 is equivalent to (the function e^x is single valued
and monotone)

    \mathrm{Prob}\{Y < y_p\} = p ,

where

    y_p = \ln c_p .                                             [Eq. 5]

The problem is now to find y_p, the confidence limit for a sum Y
of independent random variables Y_i, where the density function
h(y) of Y will approach the normal distribution as N → ∞,
because of the central limit theorem.
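
The equivalence between Eq. 2 and the statement for Y can be checked numerically. The following is a minimal sketch, not part of the original program: the function names, the uniform test distribution and the sample size are all assumptions made for illustration. It shows that an empirical quantile of the product equals the exponential of the same quantile of the sum of logarithms, because the logarithm is monotone.

```python
import math
import random

# Hypothetical helper names; the uniform factors below are only test data.

def product_quantile_direct(samples, p):
    """Empirical p-quantile of the row products C = C_1 * ... * C_N."""
    products = sorted(math.prod(row) for row in samples)
    k = min(int(p * len(products)), len(products) - 1)
    return products[k]

def product_quantile_via_logs(samples, p):
    """Same quantile, obtained on the log scale: c_p = exp(y_p) (Eq. 5)."""
    log_sums = sorted(sum(math.log(c) for c in row) for row in samples)
    k = min(int(p * len(log_sums)), len(log_sums) - 1)
    return math.exp(log_sums[k])

random.seed(1)
# 2001 realisations of N = 3 strictly positive factors.
samples = [[random.uniform(0.05, 1.0) for _ in range(3)] for _ in range(2001)]

direct = product_quantile_direct(samples, 0.9)
via_logs = product_quantile_via_logs(samples, 0.9)
```

Both routes give the same limit up to floating-point rounding; the method works on the log scale only because the sum is closer to normal than the product.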

We have now assured that the Cornish-Fisher method will be accurate
for large N. For small and moderate N nothing can be said in
general, except that the more h(y) differs from normality, the
more inaccurate the method will be. The case of small N is
discussed later in the special case where f_i(c_i) is a beta
distribution.


1.4 Cumulants

In this section we will make use of the characteristic function
of a random variable, X, defined as

    \varphi(t) = \int_{-\infty}^{\infty} e^{itx} f(x)\,dx ,      [Eq. 6]

where t is a real variable, i = \sqrt{-1} (i is also used as a
suffix, this is believed not to create confusion) and f(x) is
the density function of X.

We will also use the characteristic function of a function of the
random variable X, g(x), defined as

    \varphi_g(t) = \int_{-\infty}^{\infty} e^{itg(x)} f(x)\,dx . [Eq. 7]

The cumulants κ_1, κ_2, ..., κ_r are defined formally by the
identity in t

    \exp\left\{ \kappa_1 t + \kappa_2 \frac{t^2}{2!} + \dots + \kappa_r \frac{t^r}{r!} + \dots \right\}
        = 1 + \mu'_1 t + \mu'_2 \frac{t^2}{2!} + \dots + \mu'_r \frac{t^r}{r!} + \dots ,

where μ'_r is the r-th moment about the origin:

    \mu'_r = \int_{-\infty}^{\infty} x^r f(x)\,dx ,

which is assumed to exist.

Provided an expansion in power series exists for ln φ(t) it can
be shown that

    \kappa_1 (it) + \kappa_2 \frac{(it)^2}{2!} + \dots + \kappa_r \frac{(it)^r}{r!} + \dots = \ln \varphi(t) .   [Eq. 8]

Therefore

    \kappa_r = \left[ \frac{d^r}{d(it)^r} \ln \varphi(t) \right]_{t=0} .

ln φ(t) is called the cumulant generating function or just the
cumulative function.


As  mentioned  before,  the  Cornish-Fisher  method  is  a transformation 
of  a variable.  The  transformation  used  is  a function  of  the  first 
six  cumulants  of  the  distribution  h(y)  (see  Appendix  A). 

Appendix B derives a means by which these first six cumulants
of the distribution of Y may be determined from a knowledge of
the distribution functions of the C_i.


1.5 Inversion


Once the first six cumulants of the distribution of Y are
available to us we may make use of the Cornish-Fisher expansion
(Appendix A) to find an approximation for the required confidence
limits, i.e. that value y_p for which Prob(Y > y_p) = p. This
can be transformed back into a value of c_p by means of Eq. 5.

It is of interest to note that Cornish-Fisher asymptotic expansions
can also be used to derive an estimate of the distribution function
of a stochastic variable, which is the sum of random variables
drawn from different distributions, so in principle we are not
limited to finding only the confidence limits, although this is
possibly the most important application.


1.6  A Limitation  of  the  Method 


Applications of the method are dependent on whether the integral,
which appears in Appendix B,

    I_i(s, 0) = \int (\ln c_i)^s f_i(c_i)\,dc_i , \quad s \ge 1 \text{ integer}

can be evaluated (exactly or approximately).

Since C_i ≥ 0 by definition, the only point where ln c_i is not
defined is c_i = 0. Therefore a necessary condition, which is not
always fulfilled, is that

    f_i(c_i) \to 0 \quad \text{as} \quad c_i \to 0

in such a way that the integral I_i(s, 0) is finite. A way of
avoiding this problem is to redefine C_i such that C_i > 0.
Examples of both these cases will be given later. If C_i has
no upper limit then we must also have as a necessary condition,
that I_i(s, 0) be finite. If C_i is a discrete random variable
we must define C_i > 0 or have f_i(0) = 0.


2.  APPLICATIONS 


2.1  Confidence  Limits  for  Products  of  Probabilities 


2.1.1 Product of probabilities

In this case the factors C_i in the product C are probabilities,
so f_i(c_i) = 0 for c_i outside the interval (0, 1). The family
of integrals that we must evaluate in order to apply the method
can thus be written as

    I_i(s, 0) = \int_0^1 (\ln c_i)^s f_i(c_i)\,dc_i .            [Eq. 9]

Let us assume that for each factor C_i a number of trials has been
run to estimate c_i. Let the number of trials be n_i
and the resultant number of successes be x_i, where x_i ≤ n_i.
The number x_i will then be binomially distributed.

If c_i is the true, but unknown, probability on which we would
like to place confidence limits, then the probability of x_i
successes from n_i trials is

    B(x_i | n_i, c_i) = \binom{n_i}{x_i} c_i^{x_i} (1 - c_i)^{n_i - x_i} .   [Eq. 10]

We adopt a Bayesian approach to find the posterior density
distribution f_i(c_i | x_i, n_i), in which the number of successes
of the n_i experiments is given. The posterior distribution will
be used in Eq. 9 instead of f_i(c_i). The density function
f_i(c_i | x_i, n_i) expresses our limited knowledge about the
probability c_i.

From Bayes' theorem we have

    f_i(c_i | x_i, n_i) = \frac{B(x_i | n_i, c_i)\, f_i(c_i)}{\int_0^1 B(x_i | n_i, c_i)\, f_i(c_i)\,dc_i} .   [Eq. 11]

Without any a priori knowledge of c_i we have, according to
Bayes' postulate:

    f_i(c_i) = 1 , \quad 0 < c_i < 1 .                          [Eq. 12]

This is the most important assumption in this application and
means that, before we have carried out any trials to determine the
value of c_i, any value of c_i between 0 and 1 must be accepted
as equally likely.

Substituting Eq. 10 and Eq. 12 into Eq. 11 gives

    f_i(c_i | x_i, n_i) = (n_i + 1) \binom{n_i}{x_i} c_i^{x_i} (1 - c_i)^{n_i - x_i} ,   [Eq. 13]

which is the beta distribution.

When we use this density function in the definition of the family
of integrals, Eq. 9, we obtain

    I_i(s, 0) = (n_i + 1) \binom{n_i}{x_i} \int_0^1 (\ln c_i)^s c_i^{x_i} (1 - c_i)^{n_i - x_i}\,dc_i .   [Eq. 14]
In Appendix C this is shown to reduce to the computationally
easier form

    I_i(s, 0) = (-1)^s\, s!\, (n_i + 1) \binom{n_i}{x_i} \sum_{j=0}^{n_i - x_i} (-1)^j \binom{n_i - x_i}{j} \frac{1}{(x_i + j + 1)^{s+1}} .   [Eq. 15]

We are now in a position to use the general method given previously.

A computer program has been written. To check and determine the
accuracy of the method for this application, a comparison has been
made between values obtained by the method and corresponding exact
values for the important cases of small sample sizes, n_i, and few
factors, N, in the product. The exact values have been obtained
by deriving an analytical expression in each case.

TABLE 1

MAXIMUM ABSOLUTE ERROR FOR UPPER AND LOWER CONFIDENCE LIMITS
OF 10% EACH, MULTIPLIED BY 10^3

    n_i \ N      1        2        3        4
       0       11.7     1.4      0.25     0.045
       1        6.2     0.90     0.22     0.054
       2        4.2     0.65     0.15     -
       3        3.2     0.51     -        -

Each cell in the matrix above contains the maximum absolute error
for all combinations of x_i and n_i, where at least one n_i has
the value shown in the left-hand column. Table 1 is valid only
for upper and lower confidence limits of 10% each and only for
this application.

The main conclusion from Table 1 is that the accuracy is sufficient
for all practical applications, because usually N will be greater
than 2 or 3. One could have feared that the error in Table 1 would
begin to increase outside the frame of the table. This is not so,
as shown in Appendix D. The highest value of the error will be in
the cell (n_i, N) = (0, 1). However, Appendix D has not taken into
account the computational inaccuracy.

Simpler and more accurate methods exist for the case N = 1, and
finding confidence limits for cases where the values of n_i are
large is usually of little interest.


The  material  of  this  section  makes  use  of  Bayes’s  theorem  (or  formula) 
and  Bayes’s  postulate.  Bayes’s  postulate  is  far  from  being  univer- 
sally accepted  and  therefore  the  application  in  this  section  should 
be  seen  in  this  light. 

An  example  is  given  in  Appendix  E. 


2.1.2  Cumulative  probability  curve 

The application in this section is also based on Bayes’s theorem
but instead of using Bayes’s postulate we use an extension of this
postulate proposed by H. Jeffreys (1948) [Ref. 8] and therefore the
method in this section is on even looser ground than the application
in Sect. 2.1.1.


Let a process (for example a detection process) be dependent on a
continuous parameter t (for example a distance or time) and let
the index i in Sect. 2.1.1 represent an interval of the parameter t
from t_{i-1} to t_i, and call x_i the number of failures of n_i
trials run when t was known to be in the interval i. Then,
following the notation in Sect. 2.1.1, where c_i now is the
(unknown) probability of failure for the interval i,

    1 - \prod_{i=1}^{N} C_i                                     [Eq. 16]

represents the probability of at least one success in all
intervals i = 1, ..., N. We have assumed that an outcome (success,
failure) in one interval is independent of outcomes in other
intervals.

Section 2.1.1 can be used again to find confidence limits for
Eq. 16, except that C_i is no longer a random variable of
which we have no a priori knowledge. It is possible in this case
to say something about c_i a priori because of the independence
assumption stated above.

Define

    γ(t) dt       = probability of success in a small interval
                    dt around t,

    c(t_{i-1}, t) = probability of failure in the interval
                    t_{i-1} to t.

Then we can set up a differential equation using the independence
assumption

    c(t_{i-1}, t) = c(t_{i-1}, t - dt)\,[1 - \gamma(t)\,dt]
leading to

    c_i = \exp\left\{ -\int_{t_{i-1}}^{t_i} \gamma(t)\,dt \right\} .   [Eq. 17]

γ is now the basic random variable of which we assume to have no
a priori knowledge.
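
The step from the difference relation to Eq. 17 can be spelled out as follows. This is a sketch under two assumptions not stated explicitly in the text: that c(t_{i-1}, t) is differentiable in t, and that failure is certain over an interval of zero length, c(t_{i-1}, t_{i-1}) = 1. Here γ(t) denotes the success probability per unit of t defined above.

```latex
\begin{aligned}
c(t_{i-1},t) - c(t_{i-1},t-dt) &= -\gamma(t)\,c(t_{i-1},t-dt)\,dt \\[2pt]
\frac{\partial}{\partial t}\,c(t_{i-1},t) &= -\gamma(t)\,c(t_{i-1},t),
    \qquad c(t_{i-1},t_{i-1}) = 1 \\[2pt]
\ln c(t_{i-1},t_i) &= -\int_{t_{i-1}}^{t_i}\gamma(t)\,dt \\[2pt]
c_i = c(t_{i-1},t_i) &= \exp\Bigl\{-\int_{t_{i-1}}^{t_i}\gamma(t)\,dt\Bigr\}
\end{aligned}
```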

It is noted that Eq. 17 is not a one-to-one transformation between
c_i and γ(t). For a given c_i there is an infinite number of
functions γ(t) that satisfy Eq. 17, but not vice versa. Therefore
Eq. 17 is regarded as containing some information about c_i, and γ
is regarded as a more basic variable than c_i.

Since 0 < γ < ∞ we must use the Jeffreys extension [Ref. 8] of Bayes'
postulate, which states that if γ ranges from 0 to ∞ the prior
distribution is taken as proportional to dγ/γ (instead of just to
dγ as in the original postulate of Bayes, which deals only with a
finite definition range):

    g(\gamma)\,d\gamma = \frac{d\gamma}{\gamma} , \quad 0 \le \gamma < \infty .

To find the density function f_i(c_i) of c_i the (Jacobian)
transformation

    f_i(c_i)\,|dc_i| = g(\gamma)\,|d\gamma|

gives, instead of Eq. 12,

    f_i(c_i)\,dc_i = \frac{K}{c_i \ln c_i}\,dc_i , \quad K \text{ a negative constant.}

Proceeding as in Sect. 2.1.1:

    f_i(c_i | x_i, n_i) = \frac{1}{Q(x_i, n_i)} \cdot \frac{c_i^{x_i - 1} (1 - c_i)^{n_i - x_i}}{\ln c_i} ,   [Eq. 18]

where, provided 1 ≤ x_i ≤ n_i - 1,

    Q(x_i, n_i) = \int_0^1 c_i^{x_i - 1} (1 - c_i)^{n_i - x_i} \frac{dc_i}{\ln c_i}
                = \sum_{j=0}^{n_i - x_i} (-1)^j \binom{n_i - x_i}{j} \ln(j + x_i) ,   [Eq. 19]

and

    I_i(s, 0) = \frac{1}{Q(x_i, n_i)} \int_0^1 c_i^{x_i - 1} (1 - c_i)^{n_i - x_i} (\ln c_i)^{s-1}\,dc_i ,

    I_i(s, 0) = \frac{(-1)^{s-1} (s-1)!}{Q(x_i, n_i)} \sum_{j=0}^{n_i - x_i} (-1)^j \binom{n_i - x_i}{j} \frac{1}{(j + x_i)^s} , \quad s \ge 1 .

Again we are in a position to use the general method given
previously.

During the determination of Q(x_i, n_i) the following restrictions
on x_i were necessary

    1 \le x_i \le n_i - 1 .

It can be shown that

    x_i = n_i \;\Rightarrow\; I_i(s, 0) = 0 , \quad \text{all } s \ge 1 .

Although from Eq. 18 and Eq. 19 we have

    I_i(s, 0) = \lim_{x_i \to 0} \frac{(-1)^{s-1} (s-1)!}{\ln(x_i) \cdot x_i^s} ,

this application is not defined for x_i = 0 (no failures),
since σ → ∞ (see Appendix A).

It means that this application can be used only if there is at least
one failure in each interval (x_i > 0). The minimum values of x_i
and n_i are (x_i, n_i) = (1, 1). Intervals with no successes
(x_i = n_i) need not be included in the calculation of the
cumulants, since they do not contribute to Eq. B.2 in Appendix B.

The need to have a well-defined binomial process (constant n_i)
in each interval determines the intervals.

The major assumption in this section is the use of Jeffreys'
extension of Bayes' postulate. Other distributions than dγ/γ
can be used if better arguments than Jeffreys' can be given.

An  example  is  given  in  Appendix  E. 


CONCLUSION 


The purpose of this paper was to present a method for obtaining
confidence intervals for a product of independent positive random
variables that does not assume a special type of distribution
for the factors in the product. This is done, although the method
is only approximate.

The problem of finding confidence limits has been reduced to the
task of finding a family of definite integrals for each factor
in the product separately. The important thing is that the area
of work has been transformed from the product to each factor.

The applications show that in some important cases the integrals
can be found exactly.

In the application where the factors are binomially distributed
the accuracy is acceptable even for very small sample sizes.

All the applications given can be used down to sample sizes
from zero to two depending on the application.

In general the accuracy of the method is no better than that of
the asymptotic expansion used and is sometimes worse due to
additional computation before applying the expansion.

The method cannot be upheld as an elegant mathematical technique.
It was believed that elegance should be sacrificed in favour of the
development of a usable general method.
APPENDIX A


ASYMPTOTIC EXPANSION FOR THE INVERSE FUNCTION
OF AN ARBITRARY DISTRIBUTION FUNCTION


For a detailed discussion, see Refs. A.1, A.2. A briefer
description is given in Ref. A.3, as follows:

Let the distribution function of a stochastic variable Y be
denoted by F(y) and its cumulants by κ_r. Then the (Cornish-Fisher)
asymptotic expansion for the value of y_p such that
F(y_p) = 1 - p is (~ means asymptotically equal)

    y_p \sim m + \sigma w ,

where m = \kappa_1 , \sigma = \sqrt{\kappa_2} and

    w = x + [\gamma_1 h_1(x)]
          + [\gamma_2 h_2(x) + \gamma_1^2 h_{11}(x)]
          + [\gamma_3 h_3(x) + \gamma_1 \gamma_2 h_{12}(x) + \gamma_1^3 h_{111}(x)]
          + [\gamma_4 h_4(x) + \gamma_2^2 h_{22}(x) + \gamma_1 \gamma_3 h_{13}(x)
             + \gamma_1^2 \gamma_2 h_{112}(x) + \gamma_1^4 h_{1111}(x)]
          + \dots                                               [Eq. A.1]

where γ is defined as

    \gamma_{r-2} = \frac{\kappa_r}{\sigma^r} , \quad r \ge 3 ,

and x is determined by

    \frac{1}{\sqrt{2\pi}} \int_x^{\infty} \exp\{-t^2/2\}\,dt = p .

x is the fractile of the normalized normal distribution. The
terms on each line in Eq. A.1 are of the same order of magnitude.

    h_1(x)    =  \frac{1}{6}  He_2(x)
    h_2(x)    =  \frac{1}{24} He_3(x)
    h_{11}(x)   = -\frac{1}{36} [2 He_3(x) + He_1(x)]
    h_3(x)    =  \frac{1}{120} He_4(x)
    h_{12}(x)   = -\frac{1}{24} [He_4(x) + He_2(x)]
    h_{111}(x)  =  \frac{1}{324} [12 He_4(x) + 19 He_2(x)]
    h_4(x)    =  \frac{1}{720} He_5(x)
    h_{22}(x)   = -\frac{1}{384} [3 He_5(x) + 6 He_3(x) + 2 He_1(x)]
    h_{13}(x)   = -\frac{1}{180} [2 He_5(x) + 3 He_3(x)]
    h_{112}(x)  =  \frac{1}{288} [14 He_5(x) + 37 He_3(x) + 8 He_1(x)]
    h_{1111}(x) = -\frac{1}{7776} [252 He_5(x) + 832 He_3(x) + 227 He_1(x)] .

The He_n(x) are the Hermite polynomials, which can be calculated
recursively from

    He_0(x) = 1 , \quad He_1(x) = x

and

    He_{n+1}(x) = x\,He_n(x) - n\,He_{n-1}(x) , \quad n \ge 1 .
APPENDIX B


THE FIRST SIX CUMULANTS


We will find expressions for the first six cumulants by means
of the characteristic function φ_Y(t) of Y.

Using Eqs. 4 and 6 of the main text and the assumption that the
Y_i are independent leads to

    \varphi_Y(t) = \prod_{i=1}^{N} \varphi_{Y_i}(t) ,

where φ_{Y_i}(t) is the characteristic function for the Y_i.

Using Eq. 7 of the main text for the function in Eq. 3 of the main
text gives

    \varphi_{Y_i}(t) = \int \exp\{it \ln c_i\}\, f_i(c_i)\,dc_i ,

with limits of integration from zero to infinity; thus

    \varphi_{Y_i}(t) = \int_0^{\infty} c_i^{it} f_i(c_i)\,dc_i . [Eq. B.1]

From Eq. 8 of the main text we have

    \ln \varphi_Y(t) = \sum_{r=1}^{\infty} \kappa_r \frac{(it)^r}{r!} .

Denoting the cumulants of the distribution of Y_i by κ_{i,r} we
have again from Eq. 8 of the main text

    \ln \varphi_{Y_i}(t) = \sum_{r=1}^{\infty} \kappa_{i,r} \frac{(it)^r}{r!} ,

leading to

    \kappa_r = \sum_{i=1}^{N} \kappa_{i,r} .                     [Eq. B.2]

κ_{i,r} can be determined by

    \kappa_{i,r} = \left[ \frac{d^r}{d(it)^r} \ln \varphi_{Y_i}(t) \right]_{t=0} .   [Eq. B.3]

Substituting Eq. B.1 in Eq. B.3 gives

    \kappa_{i,r} = \left[ \frac{d^r}{d(it)^r} \ln \int_0^{\infty} c_i^{it} f_i(c_i)\,dc_i \right]_{t=0} .   [Eq. B.4]

Define a function I_i(s, t) as

    I_i(s, t) = \int_0^{\infty} (\ln c_i)^s c_i^{it} f_i(c_i)\,dc_i ,

then we have the property of I_i(s, t):

    \frac{d}{d(it)} I_i(s, t) = I_i(s + 1, t) .

Equation B.4 can then be written as

    \kappa_{i,r} = \left[ \frac{d^r}{d(it)^r} \ln I_i(0, t) \right]_{t=0} .   [Eq. B.5]

From Eq. B.5 we find

    \kappa_{i,1} = I_i(1, 0)
    \kappa_{i,2} = I_i(2, 0) - \kappa_{i,1}^2
    \kappa_{i,3} = I_i(3, 0) - 3\kappa_{i,1}\kappa_{i,2} - \kappa_{i,1}^3
    \kappa_{i,4} = I_i(4, 0) - 4\kappa_{i,1}\kappa_{i,3} - 6\kappa_{i,1}^2\kappa_{i,2}
                   - 3\kappa_{i,2}^2 - \kappa_{i,1}^4
    \kappa_{i,5} = I_i(5, 0) - 5\kappa_{i,1}\kappa_{i,4} - 10\kappa_{i,2}\kappa_{i,3}
                   - 10\kappa_{i,1}^2\kappa_{i,3} - 15\kappa_{i,1}\kappa_{i,2}^2
                   - 10\kappa_{i,1}^3\kappa_{i,2} - \kappa_{i,1}^5
    \kappa_{i,6} = I_i(6, 0) - 6\kappa_{i,1}\kappa_{i,5} - 15\kappa_{i,2}\kappa_{i,4}
                   - 15\kappa_{i,1}^2\kappa_{i,4} - 60\kappa_{i,1}\kappa_{i,2}\kappa_{i,3}
                   - 20\kappa_{i,1}^3\kappa_{i,3} - 45\kappa_{i,1}^2\kappa_{i,2}^2
                   - 15\kappa_{i,1}^4\kappa_{i,2} - 10\kappa_{i,3}^2
                   - 15\kappa_{i,2}^3 - \kappa_{i,1}^6 .

κ_r (r = 1, ..., 6) can now be determined, provided the integrals
I_i(s, 0) can be evaluated:

    I_i(s, 0) = \int_0^{\infty} (\ln c_i)^s f_i(c_i)\,dc_i .     [Eq. B.6]
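
The explicit expressions for the first six cumulants are instances of the general moment-to-cumulant recursion, which can be sketched compactly. The block below is illustrative only: the function names are hypothetical, and the recursion kappa_n = m_n - sum_{k<n} C(n-1, k-1) kappa_k m_{n-k} is the standard identity, used here as an assumed equivalent of the six printed formulas, with m_s = I_i(s, 0). The second helper applies Eq. B.2.

```python
from math import comb, factorial

def cumulants_from_raw_moments(moments):
    """moments[s-1] = I_i(s, 0) = E[(ln C_i)**s]; returns [kappa_1, ..., kappa_R]."""
    m = [1.0] + list(moments)      # m[0] = I_i(0, 0) = 1 for a normalised density
    kappa = [0.0] * len(m)         # kappa[0] is unused
    for n in range(1, len(m)):
        # Standard recursion linking raw moments and cumulants.
        correction = sum(comb(n - 1, k - 1) * kappa[k] * m[n - k]
                         for k in range(1, n))
        kappa[n] = m[n] - correction
    return kappa[1:]

def sum_cumulants(per_factor_cumulants):
    """Eq. B.2: cumulants of Y are the sums of the cumulants of the Y_i."""
    return [sum(column) for column in zip(*per_factor_cumulants)]

# Illustration: C uniform on (0, 1), for which I(s, 0) = (-1)**s * s!.
uniform_moments = [(-1) ** s * factorial(s) for s in range(1, 7)]
uniform_cumulants = cumulants_from_raw_moments(uniform_moments)
```

For the uniform factor the cumulants of ln C come out as (-1)^r (r-1)!, which is the known result for the negative of a unit exponential variable.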
APPENDIX C


EVALUATING A DEFINITE INTEGRAL


Dropping the index i in Eq. 14 of the main text we have:

    I(s, 0) = (n + 1) \binom{n}{x} \int_0^1 (\ln c)^s c^x (1 - c)^{n-x}\,dc .   [Eq. C.1]

(1 - c)^{n-x} can be expanded in a binomial series:

    (1 - c)^{n-x} = \sum_{k=0}^{n-x} \binom{n-x}{k} (-c)^k .     [Eq. C.2]

Using Eq. C.2 in Eq. C.1:

    I(s, 0) = (n + 1) \binom{n}{x} \sum_{k=0}^{n-x} (-1)^k \binom{n-x}{k} \int_0^1 (\ln c)^s c^{x+k}\,dc .   [Eq. C.3]

From standard integral tables:

    \int (\ln c)^s c^{x+k}\,dc = \frac{c^{x+k+1}}{x+k+1}\; s! \sum_{j=0}^{s} \frac{(-1)^j (\ln c)^{s-j}}{(s-j)!\,(x+k+1)^j} .

Therefore

    \int_0^1 (\ln c)^s c^{x+k}\,dc = \frac{s!}{x+k+1} \cdot \frac{(-1)^s}{(x+k+1)^s} .   [Eq. C.4]

Substituting Eq. C.4 into Eq. C.3 gives Eq. 15 of the main text.
As a partial check we must have I(0, 0) = 1.
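
The partial check I(0, 0) = 1 and a few other special cases are easy to confirm numerically. The sketch below evaluates Eq. 15 of the main text in exact rational arithmetic; the function name is hypothetical, and the use of Python's `fractions` module is a choice made here to avoid the cancellation that the alternating sum can suffer in floating point.

```python
from fractions import Fraction
from math import comb, factorial

def log_moment_beta_posterior(s, n, x):
    """I(s, 0) of Eq. 15 for n trials and x successes, as an exact Fraction."""
    total = sum(
        Fraction((-1) ** j * comb(n - x, j), (x + j + 1) ** (s + 1))
        for j in range(n - x + 1)
    )
    return (-1) ** s * factorial(s) * (n + 1) * comb(n, x) * total

# Normalisation check from the end of Appendix C: I(0, 0) = 1.
norm = log_moment_beta_posterior(0, 10, 8)
# First log-moment for n = 2, x = 1; direct evaluation of Eq. 15 gives -5/6.
mean_log = log_moment_beta_posterior(1, 2, 1)
```

When x = n the sum has a single term and the function reduces to (-1)^s s!/(n+1)^s, the expression used in Appendix D.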
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APPENDIX  D 


LOCATION  OF  THE  MAXIMUM  ERROR 


From  Eq.  B.2  in  Appendix  B we  see  that  Kp  is  of  order  N; 
we  may  write 


Kr~  0(N)  . 


Then 


Y 


r-2 


/v/  ©(N1 


r ^ 3 

(see  App.A) 


[Eq.  D.l] 


The  y*s  are  a measure  of  the  deviation  of  h(y)  (see  main  text) 
from  the  normal  distribution.  Because  of  Eq.  D.l  the  absolute 
error  will  then  decrease  monotonically  as  N increases. 


Therefore  a maximum,  apart  from  (n£,  N)  = (0,  1),  must  lie 
in  the  column  N = 1 of  Table  1 of  the  main  text,  if  a maximum 
exists.  For  N = 1 the  density  function  h(y)  for  Y = 2n  C 
is  (dropping  the  index  i): 


h(y)  = (n+1  )•(”)•  (1  - ey)n~x  . 


This  function  has  a maximum,  except  for  x = n,  which  is  the 
case  that  differs  most  from  the  corresponding  normal  density 
function  with  the  same  mean  and  variance.  This  does  not  change 
with  increasing  n.  Therefore  the  maximum  error  will  always 
come  from  this  family  of  curves  (x  = n). 


For  these  curves  we  have 

lS 


l(s,0)  = 

(n+1)5 


giving 


K = 
r 


(n+1)1 


A is  a constant 


rr-2 


K 


r/8 


= A^-r/a  , r*3  (see  App.A) 


[Eq.  D.2] 
Hence the $\gamma$'s are constants, independent of n. The argument $y_p$
in Appendix A can then be written

\[ y_p \sim -\frac{A}{n+1} \qquad (A \text{ is a new constant} > 0). \]

So the argument for C will then be

\[ c_p \sim e^{-A/(n+1)} \qquad \text{(see Eq. 5 of main text).} \]

A direct evaluation of $c_p$ is

\[ g(c) = (n+1)\, c^{n} \]

\[
\int_{0}^{c_p} (n+1)\, c^{n}\, dc = 1-p
\;\Rightarrow\; c_p^{\,n+1} = 1-p
\;\Rightarrow\; c_p = e^{\ln(1-p)/(n+1)}
\]

The difference

\[ e^{-A/(n+1)} - e^{\ln(1-p)/(n+1)} , \]

which is the error given in column N = 1 in Table 1 of the main
text, has only one maximum at n = 0 and tends to zero as $n \to \infty$.
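
The behaviour of this difference can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the expansion actually used in Appendix A: it takes the standard Cornish-Fisher expansion up to the fourth cumulant, uses the skewness -2 and excess kurtosis 6 of Y = ln C for the case x = n (a negative exponential variable), and treats `tail` as the quantity 1 - p. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def exact_limit(n: int, tail: float) -> float:
    """Solve integral_0^c (n+1) c^n dc = tail, i.e. c^(n+1) = tail."""
    return tail ** (1.0 / (n + 1))


def approx_limit(n: int, tail: float) -> float:
    """Cornish-Fisher estimate of the same limit (assumed four-cumulant form)."""
    z = NormalDist().inv_cdf(tail)
    g1, g2 = -2.0, 6.0  # gamma_1 and gamma_2 of Y = ln C when x = n
    w = (z + g1 * (z * z - 1) / 6
         + g2 * (z ** 3 - 3 * z) / 24
         - g1 * g1 * (2 * z ** 3 - 5 * z) / 36)
    mean = -1.0 / (n + 1)   # K_1
    sigma = 1.0 / (n + 1)   # square root of K_2
    return math.exp(mean + sigma * w)


for n in (0, 1, 2, 5, 20, 100):
    err = approx_limit(n, 0.9) - exact_limit(n, 0.9)
    print(n, round(err, 6))
```

With tail = 0.9 the printed error is largest in magnitude at n = 0 and shrinks as n grows, in line with the statement above. Other tail values or a different truncation of the expansion may behave differently.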

Looking at Table 1 in the main text, the error is seen to fall off
more slowly vertically than horizontally. This is because the
$\gamma$'s (which are related to the error) remain constant (Eq. D.2) for
increasing n (N fixed), whereas the $\gamma$'s decrease (Eq. D.1) towards
zero for increasing N (n fixed).
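
The two behaviours can be made concrete for the family x = n. The sketch below assumes N identical independent factors, each with density (n+1)c^n, so that Y = ln C is a sum of N negative exponential variables whose cumulants add. The per-factor cumulant (-1)^r (r-1)!/(n+1)^r is the standard result for that distribution, and the function names are hypothetical.

```python
import math


def cumulant(r: int, n: int, N: int) -> float:
    """K_r of Y = sum of N independent ln C_i, each C_i with density (n+1) c^n."""
    per_factor = (-1) ** r * math.factorial(r - 1) / (n + 1) ** r
    return N * per_factor  # cumulants of independent terms add


def gamma_std(r: int, n: int, N: int) -> float:
    """Standardized cumulant gamma_{r-2} = K_r / K_2^(r/2), r >= 3."""
    return cumulant(r, n, N) / cumulant(2, n, N) ** (r / 2)


for N in (1, 4, 16):
    print(N, [round(gamma_std(3, n, N), 4) for n in (0, 5, 50)])
```

Each printed row is constant across n (the behaviour of Eq. D.2), while successive rows shrink as N grows, following the N^(1-r/2) scaling of Eq. D.1.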


APPENDIX  E 


TWO  NUMERICAL  EXAMPLES 


E.1  Product of Probabilities [Referring to Sect. 2.1.1 of main text]

Maritime patrol aircraft (MPA) detected submarines ten times in
certain ASW exercises. On these ten occasions, the submarine was
localized eight times. Another sample shows that out of nine
occasions when submarines were localized, seven resulted in an
attack on the submarine by the aircraft. A third sample shows
that out of four occasions when the submarine was attacked, three
were evaluated as successful. The above data are hypothetical
and, for the sake of comparison, have been made equal to the data
used in the example in Ref. E.1.

The question is: What are the confidence limits for the true
probability that the submarine will be killed, given that the
aircraft has detected it?

The result is:


TABLE E.1
AN EXAMPLE: PRODUCT OF PROBABILITIES

             THIS METHOD    EXACT VALUE [Ref. E.1]    ERROR IN %

UPPER 10%    0.54270669     0.54224843                + 0.085
MEDIAN       0.35047715     0.35666951                - 0.054
LOWER 10%    0.19459113     0.19460653                - 0.0079

The  error  consists  of  both  computational  error  and  error  due 
to  the  approximation. 
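
As an independent cross-check of this example, the limits can also be estimated by simulation. The sketch below is not the method of this paper: it assumes that each factor has the beta density of Eq. F.1 in Appendix F, i.e. Beta(x_i + 1, n_i - x_i + 1), and that "upper 10%" and "lower 10%" denote the 90th and 10th percentiles of the product. The function name, the number of draws and the seed are arbitrary choices.

```python
import random


def product_limits(samples, levels=(0.10, 0.50, 0.90), draws=100_000, seed=1):
    """Monte Carlo quantiles of C = prod C_i, C_i ~ Beta(x_i + 1, n_i - x_i + 1)."""
    rng = random.Random(seed)
    products = []
    for _ in range(draws):
        c = 1.0
        for x, n in samples:
            c *= rng.betavariate(x + 1, n - x + 1)
        products.append(c)
    products.sort()
    return [products[int(level * (draws - 1))] for level in levels]


# (successes, opportunities): localization, attack, successful attack
data = [(8, 10), (7, 9), (3, 4)]
lower, median, upper = product_limits(data)
print(f"lower 10%: {lower:.3f}  median: {median:.3f}  upper 10%: {upper:.3f}")
```

The simulated limits carry sampling noise of a few units in the third decimal place, so they can confirm the order of magnitude of Table E.1 but not its last digits.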


E.2  Cumulative Probability Curve [Referring to Sect. 2.1.2 of main text]

Sample tracks have been collected of a submarine and a sonar-fitted
surface ship trying to make a sonar detection of the submarine
(Fig. E.1). A dot means a detection. From this sample
we want confidence limits for the true probability of the ship
detecting the submarine outside a given range. This example could
as well have treated radar, visual or ECM detections or other kinds
of detections. The range is divided into brackets containing a
constant number of opportunities $n_i$.

Figure  E.2  gives  a general  picture  of  each  interval. 


\[ 1 \le x_i \le n_i \]

FIG. E.2  INTERVAL


Table  E.2  is  produced  from  Fig.  E.l,  leaving  out  intervals  with 
no  detections. 


TABLE E.2
AN EXAMPLE: CUMULATIVE PROBABILITY CURVE


By using the method explained in Sect. 2.1.2 of the main text on
the data in Table E.2 we obtain Fig. E.3. Figure E.3 also marks
with crosses the confidence intervals corresponding to a division
of the range into brackets of 10 units starting with range 0.
Without relation to the sample, such a division is arbitrary and
creates difficulties in choosing an appropriate value for the
number of opportunities.


REFERENCE 

E.1  SPRINGER, M.D. & THOMPSON, W.E. Bayesian confidence limits
for the product of N binomial parameters. Biometrika 53,
1966: 611-613.


FIG.  E.3  CONFIDENCE  LIMITS 
APPENDIX F


CONFIDENCE LIMITS OF A FUTURE OBSERVED VALUE OF PRODUCTS
OF PROBABILITIES, BASED ON A SAMPLE ALREADY TAKEN


F.1  Product of Probabilities


Usually an estimate for the value of the product C is wanted.

Instead of choosing a suitable estimator (mean, median, mode, etc.)
based on the distribution of C, a much simpler (but not as
satisfactory) procedure is often followed, namely, that of choosing
a suitable estimate for each of the factors $C_i$ in the product C
and then forming the product. If $R_i$ successes are observed out
of $m_i$ opportunities for each factor i, the maximum likelihood
estimator for the true probability of success is $R_i/m_i$.

Let

\[ C_i = \frac{R_i}{m_i}, \qquad 1 \le R_i \le m_i \]

$R_i$ is a random variable
$m_i$ is a non-random variable.

We are interested in finding confidence limits for the product

\[ \prod_{i=1}^{N} R_i / m_i \]

given that we have observed $x_i$ successes out of $n_i$ opportunities
for each factor i (as in Sect. 2.1.1 of the main text).

The $m_i$'s could be a future sample to be collected and $(x_i, n_i)$
could be a sample already obtained. If $z_i$ successes are observed
out of $m_i$ outcomes in the future sample, then the product

\[ z = \prod_{i=1}^{N} z_i / m_i \]

lies inside the confidence limits for the product

\[ \prod_{i=1}^{N} R_i / m_i \]

in a certain fraction of cases (dependent on the confidence level)
under the hypothesis that the underlying process (generating successes
and failures) is the same for the sample obtained $(x_i, n_i)$ and for
the future sample $(z_i, m_i)$.

Therefore if the product z lies outside the confidence limits it
is an indication that the underlying process has changed between
the sample $(x_i, n_i)$ and $(z_i, m_i)$.
The inequality $1 \le R_i \le m_i$ means that we assume continuing the
experiments until we have at least one success ($R_i \ge 1$) for
each factor i. ($R_i/m_i$ is then no longer the maximum likelihood
estimator.) The reason is to overcome the singularity at $R_i = 0$.
This assumption is not necessary for the sample $(x_i, n_i)$. Our
knowledge about the true probability of success, $c_i$, is in the
form of the distribution for $c_i$ (the beta distribution)

\[ (n_i+1)\binom{n_i}{x_i}\, c_i^{\,x_i}\, (1-c_i)^{\,n_i-x_i} \]
[Eq. F.1]


The probability that $R_i$ will take the value a is

\[ \binom{m_i-1}{a-1}\, c_i^{\,a-1}\, (1-c_i)^{\,m_i-a} \]


Therefore

\[
I_i(s,0) = \int_{0}^{1} \sum_{a=1}^{m_i}
  \left(\ln\frac{a}{m_i}\right)^{s}
  \binom{m_i-1}{a-1}\, c_i^{\,a-1} (1-c_i)^{\,m_i-a}
  \cdot (n_i+1)\binom{n_i}{x_i}\, c_i^{\,x_i} (1-c_i)^{\,n_i-x_i}\, dc_i
\]

\[
I_i(s,0) = \frac{n_i+1}{m_i+n_i}\binom{n_i}{x_i}
  \sum_{a=1}^{m_i}
  \frac{\binom{m_i-1}{a-1}}{\binom{m_i+n_i-1}{a+x_i-1}}
  \left(\ln\frac{a}{m_i}\right)^{s}
\]
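
The closed form for I_i(s, 0) is a finite sum and is straightforward to evaluate. The following is a minimal sketch, not the SACLANTCEN program: the function names are hypothetical, and the weight attached to each value a is read off from the sum above as the probability of R_i = a after integrating over Eq. F.1.

```python
import math


def predictive_pmf(a: int, x: int, n: int, m: int) -> float:
    """P(R = a), a = 1..m, after integrating the conditional law over Eq. F.1."""
    return ((n + 1) / (m + n) * math.comb(n, x)
            * math.comb(m - 1, a - 1) / math.comb(m + n - 1, a + x - 1))


def moment(s: int, x: int, n: int, m: int) -> float:
    """I_i(s, 0): s-th moment of ln(R / m) for one factor."""
    return sum(predictive_pmf(a, x, n, m) * math.log(a / m) ** s
               for a in range(1, m + 1))


# first factor of the Appendix E example, with m = n
print(moment(0, 8, 10, 10), moment(1, 8, 10, 10), moment(2, 8, 10, 10))
```

As with Eq. C.4, the value for s = 0 should equal 1, which gives a quick check that the weights form a probability distribution.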


Of interest is the case $m_i = n_i$, which is implemented in the
computer program.

The case $m_i = n_i$ means that a future sample is to be collected
with exactly the same sample sizes as the sample already obtained.
Confidence limits for the product

\[ \prod_{i=1}^{N} R_i / n_i \]
[Eq. F.2]

can then be used as a measure of uncertainty for the product

\[ \prod_{i=1}^{N} x_i / n_i \]

already obtained.
Therefore confidence limits for Eq. F.2 are, in the following,
called "limits for an observed value of the product".


EXAMPLE 

The same example is used as in Appendix E. The upper and
lower 10% limits for the observed product

\[ \frac{8}{10} \times \frac{7}{9} \times \frac{3}{4} \]

are given in Fig. F.1, together with confidence limits for the
true probability computed in Appendix E.


Probability   TRUE     OBSERVED

Upper 10%     0.542    0.721
Lower 10%     0.195    0.188


FIG. F.1  LIMITS FOR AN OBSERVED VALUE
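
Because each R_i takes only finitely many values, the distribution of the observed product can also be enumerated exactly for a small example. The sketch below is an illustrative alternative to the approximate method of this paper, with hypothetical function names. It assumes m_i = n_i, uses the weights appearing in the closed form for I_i(s, 0), and defines a limit as the smallest product value whose cumulative probability reaches the requested level, so its results need not coincide with the figures above.

```python
import math
from itertools import product


def predictive_pmf(a: int, x: int, n: int, m: int) -> float:
    """P(R = a), a = 1..m, for a future sample of size m given (x, n)."""
    return ((n + 1) / (m + n) * math.comb(n, x)
            * math.comb(m - 1, a - 1) / math.comb(m + n - 1, a + x - 1))


def observed_limits(samples, levels=(0.10, 0.90)):
    """Exact quantiles of prod R_i / n_i; levels must be in ascending order."""
    factors = [[(a / n, predictive_pmf(a, x, n, n)) for a in range(1, n + 1)]
               for x, n in samples]
    outcomes = []
    for combo in product(*factors):
        value = math.prod(v for v, _ in combo)
        prob = math.prod(p for _, p in combo)
        outcomes.append((value, prob))
    outcomes.sort()
    limits, cumulative, pending = [], 0.0, list(levels)
    for value, prob in outcomes:
        cumulative += prob
        while pending and cumulative >= pending[0] - 1e-12:
            limits.append(value)
            pending.pop(0)
    return limits


data = [(8, 10), (7, 9), (3, 4)]
lower, upper = observed_limits(data)
print(f"lower 10%: {lower:.3f}  upper 10%: {upper:.3f}")
```

The enumeration grows as the product of the sample sizes, so it is only practical for a few small factors; the cumulant-based method of the main text avoids that cost.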


F.2  Cumulative Probability Curve

Calling M the number of failures ($R_i \ge 1$) and replacing Eq. F.1
by Eq. 18 of the main text:

\[
I_i(s,0) = \int_{0}^{1} \sum_{x=1}^{m_i}
  \left(\ln\frac{x}{m_i}\right)^{s}
  \binom{m_i-1}{x-1}\, c_i^{\,x-1} (1-c_i)^{\,m_i-x}
  \cdot \frac{c_i^{\,x_i-1} (1-c_i)^{\,n_i-x_i}}{Q(x_i, n_i)}\, dc_i
\]

\[
I_i(s,0) = \frac{1}{Q(x_i, n_i)} \sum_{x=1}^{m_i}
  \left(\ln\frac{x}{m_i}\right)^{s}
  \binom{m_i-1}{x-1}\, Q(x+x_i-1,\; m_i+n_i-1)
\]

Again the case $m_i = n_i$ is of interest.


Difficulties have been encountered in computing $Q(x+x_i-1,\, m_i+n_i-1)$
accurately for larger N (N > 20); therefore another expression
for Q giving more accuracy than Eq. 19 of the main text has
been used in the computer program. A listing of the computer
program and test examples are available from SACLANTCEN.
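
One common way to keep such quantities accurate for large arguments is to work with logarithms of gamma functions. The sketch below is an assumption-laden illustration and not the expression used in the SACLANTCEN program: it assumes that Q(x, n) denotes the integral of c^(x-1) (1-c)^(n-x) over (0, 1), which is not confirmed by the text reproduced here, and the function names are hypothetical.

```python
import math


def log_q(x: int, n: int) -> float:
    """ln Q(x, n), assuming Q(x, n) = integral_0^1 c^(x-1) (1-c)^(n-x) dc."""
    return math.lgamma(x) + math.lgamma(n - x + 1) - math.lgamma(n + 1)


def q_ratio(x_new: int, n_new: int, x: int, n: int) -> float:
    """Q(x_new, n_new) / Q(x, n), formed in log space to avoid underflow."""
    return math.exp(log_q(x_new, n_new) - log_q(x, n))


# the ratio stays well-scaled even when each Q separately is extremely small
print(q_ratio(501, 1001, 500, 1000))
```

Forming the ratio before exponentiating is the design choice that matters here: the individual values of Q shrink rapidly with their arguments, whereas the ratios that enter I_i(s, 0) remain of moderate size.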